5 research outputs found

    BP-NTT: Fast and Compact in-SRAM Number Theoretic Transform with Bit-Parallel Modular Multiplication

    Full text link
    Number Theoretic Transform (NTT) is an essential mathematical tool for computing polynomial multiplication in promising lattice-based cryptography. However, costly division operations and complex data dependencies make efficient and flexible hardware design to be challenging, especially on resource-constrained edge devices. Existing approaches either focus on only limited parameter settings or impose substantial hardware overhead. In this paper, we introduce a hardware-algorithm methodology to efficiently accelerate NTT in various settings using in-cache computing. By leveraging an optimized bit-parallel modular multiplication and introducing costless shift operations, our proposed solution provides up to 29x higher throughput-per-area and 2.8-100x better throughput-per-area-per-joule compared to the state-of-the-art.Comment: This work is accepted to the 60th Design Automation Conference (DAC), 202

    BioHD: An Efficient Genome Sequence Search Platform Using HyperDimensional Memorization

    No full text
    In this paper, we propose BioHD, a novel genomic sequence searching platform based on Hyper-Dimensional Computing (HDC) for hardware-friendly computation. BioHD transforms inherent sequential processes of genome matching to highly-parallelizable computation tasks. We exploit HDC memorization to encode and represent the genome sequences using high-dimensional vectors. Then, it combines the genome sequences to generate an HDC reference library. During the sequence searching, BioHD performs exact or approximate similarity check of an encoded query with the HDC reference library. Our framework simplifes the required sequence matching operations while introducing a statistical model to control the alignment quality. To get actual advantage from BioHD inherent robustness and parallelism, we design a processing in-memory (PIM) architecture with massive parallelism and compatible with the existing crossbar memory. Our PIM architecture supports all essential BioHD operations natively in memory with minimal modifcation on the array. We evaluate BioHD accuracy and efciency on a wide range of genomics data, including COVID-19 databases. Our results indicate that PIM provides 102.8× and 116.1× (9.3× and 13.2×) speedup and energy efciency compared to the state-of-theart pattern matching algorithm running on GeForce RTX 3060 Ti GPU (state-of-the-art PIM accelerator). © 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM

    Neural Computation for Robust and Holographic Face Detection

    No full text
    Face detection is an essential component of many tasks in computer vision with several applications. However, existing deep learning solutions are significantly slow and inefficient to enable face detection on embedded platforms. In this paper, we propose HDFace, a novel framework for highly efficient and robust face detection. HDFace exploits HyperDimensional Computing (HDC) as a neurally-inspired computational paradigm that mimics important brain functionalities towards high-efficiency and noise-tolerant computation. We first develop a novel technique that enables HDC to perform stochastic arithmetic computations over binary hypervectors. Next, we expand these arithmetic for efficient and robust processing of feature extraction algorithms in hyperspace. Finally, we develop an adaptive hyperdimensional classification algorithm for effective and robust face detection. We evaluate the effectiveness of HDFace on large-scale emotion detection and face detection applications. Our results indicate that HDFace provides, on average, 6.1X (4.6X) speedup and 3.0X (12.1X) energy efficiency as compared to neural networks running on CPU (FPGA), respectively. © 2022 Owner/Author
    corecore